 green cube



Can Visuo-motor Policies Benefit from Random Exploration Data? A Case Study on Stacking

arXiv.org Artificial Intelligence

Human demonstrations have been key to recent advancements in robotic manipulation, but their scalability is hampered by the substantial cost of the required human labor. In this paper, we focus on random exploration data (video sequences and actions produced autonomously via motions to randomly sampled positions in the workspace) as an often overlooked resource for training visuo-motor policies in robotic manipulation. Within the scope of imitation learning, we examine random exploration data through two paradigms: (a) by investigating the use of random exploration video frames with three self-supervised learning objectives (reconstruction, contrastive, and distillation losses) and evaluating their applicability to visual pre-training; and (b) by analyzing random motor commands in the context of a staged learning framework to assess their effectiveness in autonomous data collection. Towards this goal, we present a large-scale experimental study based on over 750 hours of robot data collection, comprising 400 successful and 12,000 failed episodes. Our results indicate that: (a) among the three self-supervised learning objectives, contrastive loss appears most effective for visual pre-training while leveraging random exploration video frames; (b) data collected with random motor commands may play a crucial role in balancing the training data distribution and improving success rates in autonomous data collection within this study. The source code and dataset will be made publicly available at https://cloudgripper.org.


Automatic Behavior Tree Expansion with LLMs for Robotic Manipulation

arXiv.org Artificial Intelligence

Robotic systems for manipulation tasks are increasingly expected to be easy to configure for new tasks or unpredictable environments, while keeping a transparent policy that is readable and verifiable by humans. We propose the method BEhavior TRee eXPansion with Large Language Models (BETR-XP-LLM) to dynamically and automatically expand and configure Behavior Trees as policies for robot control. The method utilizes an LLM to resolve errors outside the task planner's capabilities, both during planning and execution. We show that the method is able to solve a variety of tasks and failures and permanently update the policy to handle similar problems in the future.


Developmental Scaffolding with Large Language Models

arXiv.org Artificial Intelligence

Exploration and self-observation are key mechanisms of infant sensorimotor development. These processes are further guided by parental scaffolding, which accelerates skill and knowledge acquisition. In developmental robotics, this approach has often been adopted by having a human act as the source of scaffolding. In this study, we investigate whether Large Language Models (LLMs) can act as a scaffolding agent for a robotic system that aims to learn to predict the effects of its actions. To this end, an object manipulation setup is considered where one object can be picked and placed on top of or in the vicinity of another object. The adopted LLM is asked to guide the action selection process through algorithmically generated state descriptions and action selection alternatives in natural language. The simulation experiments that include cubes in this setup show that LLM-guided (GPT3.5-guided) learning yields significantly faster discovery of novel structures compared to random exploration. However, we observed that GPT3.5 fails to effectively guide the robot in generating structures with different affordances such as cubes and spheres. Overall, we conclude that even without fine-tuning, LLMs may serve as a moderate scaffolding agent for improving robot learning; however, they still lack affordance understanding, which limits the applicability of current LLMs in robotic scaffolding tasks.


NonCompositional

#artificialintelligence

Written in a rush, because time flies like an arrow (whereas fruit flies like a banana). Each entry is also a chain of Tweets. When we compose meanings, concepts, semantics or any other 'elements' of cognition, the outcome is not easily predictable like it is when we compose functions in mathematics or operations in a computer programme. We all know, without really even having to think, that a wine hangover is a hangover caused by wine, but a college town is a town that has a college. It seems obvious to us that a honey bee is a bee that produces honey, but that a mountain lodge is a lodge located on a mountain.


Representation Matters: Improving Perception and Exploration for Robotics

arXiv.org Artificial Intelligence

Projecting high-dimensional environment observations into lower-dimensional structured representations can considerably improve data-efficiency for reinforcement learning in domains with limited data such as robotics. Can a single generally useful representation be found? In order to answer this question, it is important to understand how the representation will be used by the agent and what properties such a 'good' representation should have. In this paper we systematically evaluate a number of common learnt and hand-engineered representations in the context of three robotics tasks: lifting, stacking and pushing of 3D blocks. The representations are evaluated in two use-cases: as input to the agent, or as a source of auxiliary tasks. Furthermore, the value of each representation is evaluated in terms of three properties: dimensionality, observability and disentanglement. We can significantly improve performance in both use-cases and demonstrate that some representations can perform commensurate to simulator states as agent inputs. Finally, our results challenge common intuitions by demonstrating that: 1) dimensionality strongly matters for task generation, but is negligible for inputs, 2) observability of task-relevant aspects mostly affects the input representation use-case, and 3) disentanglement leads to better auxiliary tasks, but has only limited benefits for input representations. This work serves as a step towards a more systematic understanding of what makes a 'good' representation for control in robotics, enabling practitioners to make more informed choices for developing new learned or hand-engineered representations.


Meet 'Wattam,' The Newest Absurd Video Game Playground From Keita Takahashi

NPR Technology

The Mayor, a green cube with a top hat, goes "kaboom" in Wattam. The video game designer Keita Takahashi is best known for Katamari Damacy, released in 2004. It's about a god named the "King of All Cosmos" who, while drunk, accidentally destroys the stars in the sky. His son "The Prince" is left to clean up his mess by rolling up objects on Earth into sticky masses that grow so large they become new stars.


AI monitoring system aims to optimize hemp crops

#artificialintelligence

A Polish start-up has developed the first application in a suite of artificial intelligence (AI) tools dedicated to crop monitoring, yield optimization and management of outdoor hemp fields. "We'll be able to accurately predict when flowers will be perfect for a harvest. That's important not only because cannabinoids content can drop more than 35% if collected too late, but it's also critical from a logistics point of view," said Marcin Marczak, CEO at the developer, Green Cube Solutions. "With the limitations on available equipment, good planning is the key to a successful harvest." The technology being developed by Łódź-based Green Cube is an "integrated end-to-end platform that will support farmers from soil preparation through harvest and up to product distribution," Marczak said.